Abstract

Collaboration is key to scientific research, and increasingly to mathematics. This paper contains a 
longitudinal investigation of mathematics collaboration and publishing using the proprietary database 
Mathematical Reviews, maintained by the American Mathematical Society. The database contains pub- 
lications by several hundred thousand researchers over 25 years. Mathematical scientists became more 
interconnected, collaborative, and interdisciplinary over this interval, and twice the network experienced 
dramatic structural shifts. These events are examined and possible external factors are discussed. Smaller 
subject-specific subnetworks exhibit behavior that provides insight into the aggregate dynamics. The 
data are available upon request to the Executive Director of the AMS. 
1. Introduction

Collaboration networks have been studied extensively
in recent years, thanks to the availability of several
excellent databases, e.g. [Gro02, New01b, BJN+02,
FLC+04, AOL+07, TL07, Per10]. These studies have
revealed a diversity of
topological structure, especially across disciplines, 
depicting typical ranges of basic graph-theoretic 
metrics across real-world networks. While longitudinal
studies are increasingly common, they predominantly
take a cumulative approach; they observe network
growth after a designated starting year, which may
then be compared to evolving graph models [BJN+02].
However, evolving real-world networks can take decades
to exhibit clear long-term trends [RB10], and
short-term changes in structure and behavior become
obscured by aggregating information [TL07]. To strengthen
models of scientific research collaboration, cumu- 
lative models must be supplemented by dynamic 
models that capture the effective relationships among
researchers [TL07, KW06]. Furthermore, while
collaboration networks are often treated in the larger
context of complex networks, important differences
exist between social networks and other real-world
networks [NP03]. Most available
publishing databases are too specialized (by disci- 
pline or region) to exhibit clear long-term trends. 

A specialized theory of evolving social networks 
is therefore required, and is underway. In this 
paper we examine a large, longitudinal collaboration
network. The American Mathematical Society (AMS)
maintains the proprietary database Mathematical Reviews
(MR), and we study this database across 1985-2009,
during which nearly 430,000 authors produced nearly
1.6 million publications. MR aims to catalogue every
mathematical sciences publication each year, including
both print and online journals, books, proceedings, and
other publications [Jac97]. We therefore treat the
database as a census of the literature; however, we
caution that the mathematics literature is itself a
fraction of the broader scientific literature and highly
entangled therewith.* The network is
much larger than most studied scientific collabo- 
ration networks, extends over a longer time, and 
is of consistently great size, which will allow us to 
characterize long-term trends and fluctuations. 



2. Materials and Methods 

Our data consist, for each publication, of encoded
author IDs, subject classifications from the AMS
Mathematics Subject Classification Scheme [Soc11], and
the year of publication. While authors and
publications, taken together, exhibit a bipartite
structure, and bipartite models that preserve this
structure show promise [BMG04, GMY05, ZWL+08], the
larger literature and better-understood statistical
toolkit on unipartite models allow us to better
contextualize our network. We therefore adopt a
unipartite model. In this model, nodes $v_1, \ldots, v_n$
correspond to authors and links $v_i v_j$ ($m$ total)
indicate coauthorship. Each link $v_i v_j = v_j v_i$
receives a (collaboration) weight $w_{ij}$ given by the
number of joint publications by $v_i$ and $v_j$ [New04].
The graph evolves over time as authors begin and cease
publishing.

We investigated the evolving topology of the network
using several well-understood graph-theoretic metrics.
To account for the publishing process underlying this
structure while maintaining our unipartite perspective,
we introduced publication-sensitive analogs to the
strictly graph-theoretic assortativity and clustering
coefficients. These metrics reveal network properties
not captured by the originals and may warrant further
use.

*Our network is not necessarily more bibliographically
complete or self-contained than previously studied
collaboration networks (such as the Los Alamos preprint
archive in [New01c]): non-mathematician authors may
appear on mathematics publications no less frequently
than physicists who abstain from online databases
collaborate with authors who do not.

Table 1: The MR network over two intervals.

MR network                  1940-2000   1985-2009
years                       61          25
papers (thousands)          1598        1599
authors (thousands)         337         429
avg. authors/paper          1.45        1.75
avg. papers/author          6.9         6.5
collab. pairs (thousands)   496         876
avg. no. coauthors          2.9         4.1
prop. in largest comp.      .62         .75
avg. separation             7.56        7.31
global clustering coeff.    .15         .14
avg. clustering coeff.      .34         .61
assortativity               .12         .069
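As a concrete reading of the unipartite model described in this section, the following Python sketch projects a list of publications (each a list of author IDs) onto a weighted coauthorship graph and computes the average degree 2m/n. It is an illustration only, not the authors' R code; the function name `coauthorship_graph` and the toy author IDs are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def coauthorship_graph(publications):
    """Project author lists onto a weighted, undirected coauthorship graph.

    Returns the set of authors (nodes) and a Counter mapping each
    unordered author pair (link) to its number of joint publications.
    """
    authors = set()
    weights = Counter()
    for author_ids in publications:
        unique = sorted(set(author_ids))  # ignore repeated IDs on one paper
        authors.update(unique)
        for pair in combinations(unique, 2):
            weights[pair] += 1  # one more joint publication for this pair
    return authors, weights


# Hypothetical toy data: four publications by five encoded authors.
publications = [["a", "b"], ["a", "b", "c"], ["d"], ["c", "e"]]
authors, weights = coauthorship_graph(publications)

n = len(authors)            # number of nodes
m = len(weights)            # number of links (collaborating pairs)
average_degree = 2 * m / n  # average number of coauthors per author
```

Single-author publications add a node but no link, so an author who never collaborates appears as an isolated node, as in the graph described above.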



Subject classifications within the MR database 
include two-digit prefixes from 01 to 97. We di- 
vided the literature coarsely into "pure" (03-58) 
and "applied" (60-95) subnetworks and, for some
specific analyses, into the similarly-sized
subclassifications indicated in Fig. 2.†

To trace the effective structure of these networks, we
used, depending on the metric, nonoverlapping intervals
of one year or of five years, or sliding windows of 5
years. The choice of 5-year intervals offers meaningful
comparisons to [New01b]. In plots, we identify each
window by its last year; for instance, the year 1997
may refer to the interval 1993-7. Because the network
grows most quickly from 1985 to 1989, and because data
are not complete in the most recent years, we focused
mainly on the period 1989-2007.
The smaller subnetworks fluctuated widely, ob- 
scuring long-term trends, but their behavior illu- 
minates trends in the aggregate by distinguishing 
the disciplines most reflective of, and plausibly 
responsible for, those trends. 

3. Trends in Mathematical Publishing

We examined long-term trends exhibited by the 
MR network. We present the publishing data in 
a raw statistical analysis, emphasizing the rela- 
tionship of output rates to coauthorship and to 
multidisciplinarity. 

Table 1 compares our network (all 25 years taken
together) with the MR network studied in [Gro02].
Several differences detectable in the table reflect
long-term trends discussed below, including increased
collaboration (rows 4, 6, 7) and greater network
connectivity (rows 8, 9, 11).



†This scheme is imperfect. For instance, much of 60
(Probability Theory and Stochastic Processes) might be 
classified as pure mathematics, but this would split 60 from 
62 (Statistics). 
Figure 1: Across nonoverlapping 5-year intervals: (a)
Numbers of authors with 1, 2, 3, and 4 publications. (b)
Numbers of publications by 1, 2, 3, and 4 authors. (c)
Numbers of authors with 0, 1, 2, and 3 coauthors. (d)
Average number of publications by authors of a paper, as
a function of the number of authors on the paper.



3.1. Publishing rates 

We measured publishing rates individually and 
collaboratively. Mathematics researchers have 
grown more numerous and collaborative at accel- 
erating rates, though without becoming steadily 
more prolific (Fig. 1 (a-c)). In fact, in recent years
highly collaborative projects have involved authors
less prolific within mathematics, and average
prolificity has declined (Fig. 1 (d) and 3 (a-c)).
While growth in the number of more prolific authors has
accelerated, it has been outpaced by growth in the
number of authors of only one publication, as we
discuss in the supplementary text. These trends were
starker in the applied network, which housed a greater
proportion of less prolific authors, reversed its trend
from more to less prolific years earlier than the pure,
and saw a greater surge in one-time authors
(Fig. 3 (a-f)). Credit for declining average publishing
rates therefore rests largely with such authors.

This surge in less prolific authors reflects a major
event around 2001 that we will describe further. A
closer look reveals another event years earlier: a
surge in collaborative publishing after 1995. From the
interval 1989-95 to the interval 1995-2009, rates of 2-
to 6-author publications rose and rates of 7- and
more-author publications reversed from decline to rise
(Fig. 2 (g)).
Fluctuations in subject classification assignments 
and in graph-theoretic structure illuminated these 
events, as we discuss in the next section. 

3.2. Multidisciplinarity 

Figure 2: Across 1-year intervals, 1985-2007: (a) Number
of publications. (b) Number of authors. (c) Number of
publications across subnetworks. (d) Number of authors
across subnetworks. (e) Average number of authors per
publication. (f) Average number of subject classifications
per publication. (g) For each fixed number of authors
(1-8), a histogram over years of publications attributed
to that number of authors. Each histogram is scaled by
the total number of publications by that number of
authors. We include four subdisciplines: algebra (08-22),
differential equations (34-35), computer science and
information (68, 94), and classical physics (70-86).

The literature grew steadily more multidisciplinary,
except for a brief period of specialization near 1995,
as measured by the average number of secondary subject
classifications $\langle s \rangle = \langle s_i \rangle_i$
(as $i$ ranges across publications). Meanwhile the
average number of secondary authors per publication
$\langle a \rangle = \langle a_i \rangle_i$ (authors
beyond the requisite one) increased monotonically
(Fig. 2 (e,f)). While the applied network exhibited
larger $\langle a \rangle$ but smaller $\langle s \rangle$,
a regression model reveals a positive relationship
between $s_i$ and $a_i$ that is stronger in the more
multidisciplinary pure network (Fig. 3 (g-i)). We fit to
the combined pure and applied literature the linear
model

$$s_i = \alpha_0 + \alpha_1 a_i + \alpha_2 u_i + \alpha_3 a_i u_i + \epsilon_i, \tag{1}$$

where the indicator $u_i$ takes the value 0 if the
publication is classified as pure and 1 otherwise. The
parameter $\alpha_1$ is then the effect of $a_i$ in the
pure network, $\alpha_1 + \alpha_3$ that of $a_i$ in the
applied, and $\alpha_3$ the interaction effect of $a_i$
and $u_i$. This coauthorship-multidisciplinarity
relationship weakened over time, but the subnetworks
grew variably similar and dissimilar over different
intervals. Shifts in $\alpha_3$ coincided with the two
events: the pure and applied networks grew similar after
the earlier event but dissimilar after the second.
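The role of the interaction coefficient in the regression of secondary subject classifications on secondary authors can be made concrete with a small sketch. Because the pure/applied indicator is binary and the model contains an intercept, both main effects, and their interaction, its least-squares fit coincides with two separate simple regressions, one per subnetwork; the interaction coefficient is then the difference between the two slopes. The Python below uses that equivalence. The helper names and the toy records are hypothetical, and this is not the authors' R code.

```python
def simple_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_interaction_model(records):
    """Fit s = a0 + a1*a + a2*u + a3*a*u, with u = 0 (pure) or 1 (applied).

    records: iterable of (a_i, u_i, s_i) triples, one per publication.
    """
    pure = [(a, s) for a, u, s in records if u == 0]
    applied = [(a, s) for a, u, s in records if u == 1]
    pure_intercept, pure_slope = simple_fit(*zip(*pure))
    applied_intercept, applied_slope = simple_fit(*zip(*applied))
    return {
        "alpha0": pure_intercept,                      # intercept, pure
        "alpha1": pure_slope,                          # slope, pure
        "alpha2": applied_intercept - pure_intercept,  # intercept shift
        "alpha3": applied_slope - pure_slope,          # slope shift
    }


# Hypothetical, noise-free toy records (secondary authors, indicator,
# secondary classifications); not data from the MR database.
records = [(0, 0, 1.0), (1, 0, 1.5), (2, 0, 2.0), (3, 0, 2.5),
           (0, 1, 0.5), (1, 1, 0.7), (2, 1, 0.9), (4, 1, 1.3)]
coefficients = fit_interaction_model(records)
```

In this toy example the applied slope is smaller than the pure slope, so the fitted interaction coefficient is negative, which is the situation the text describes as a relationship "stronger in the pure network".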

4. Evolution of the Coauthorship Graph 

We adopt a graph-theoretic approach to study 
connectivity, correlations, and clustering in terms 
of coauthorship. We made use of several graph- 
theoretic metrics. We performed calculations on 
largest connected components unless otherwise 
noted, for two reasons: (1) The same fluctuations 
are visible in time series for entire (disconnected) 
graphs, though often subdued. (2) The steadily 
shrinking proportion of nodes outside largest com- 
ponents affects statistics sensitive to the presence 
of isolated authors and to highly connected, in- 
dependent teams (two common forms that small 
connected components take). 

4.1. Individual and network connectivity

We measured connectivity in three principal ways. The
number $k_i$ of coauthors of an author $v_i$ is that
author's degree, a measure of individual connectivity.
With an increase in average degree comes an increase in
graph density $D = m / \tfrac{1}{2} n(n-1)$, the
proportion of possible node-node links that are
realized. We may also measure global connectivity by the
proportion of nodes subsumed by the largest connected
component itself. Finally, we gain insight into the
efficiency of this connectivity from the mean node-node
separation within this component. Writing $\ell_{ij}$
for the length of a shortest coauthorship path between
$v_i$ and $v_j$, we adopted the harmonic mean separation
$\langle \ell \rangle$ between pairs of authors defined
by

$$\langle \ell \rangle^{-1} = \sum_{i<j} \ell_{ij}^{-1} \Big/ \tfrac{1}{2} n(n-1),$$

rather than the arithmetic mean, to place emphasis on
local connections [LM01]. (The metrics are nonetheless
highly correlated. Taking their residuals from linear
fits each year as ordered pairs produces r = .995,
though the arithmetic mean varies more about its fit.)
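The density and harmonic mean separation described above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: the graph is given as a dictionary mapping each node to the set of its neighbors, distances are found by breadth-first search, and pairs with no connecting path contribute zero to the sum of inverse distances (which does not arise inside a single connected component).

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def density(adj):
    """Proportion of the n(n-1)/2 possible links that are realized."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    return m / (n * (n - 1) / 2)


def harmonic_mean_separation(adj):
    """Inverse of the mean inverse distance over unordered node pairs."""
    n = len(adj)
    inverse_sum = 0.0
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if v != u:
                inverse_sum += 1.0 / d
    # Each unordered pair was visited twice, once from each endpoint.
    mean_inverse = (inverse_sum / 2) / (n * (n - 1) / 2)
    return 1.0 / mean_inverse


# Toy example: a path a - b - c.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
D = density(path)
ell = harmonic_mean_separation(path)
```

On the three-node path the pairwise distances are 1, 1, and 2, so the arithmetic mean separation is 4/3 while the harmonic mean is 1.2; the harmonic mean weights the two adjacent pairs more heavily, which is the emphasis on local connections mentioned above.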

While the average degree $\langle k \rangle = 2m/n$
increased, the rate of increase over 1994-2009 was
almost double that over 1989-1994, predomi- 
nantly due to applied publications (Fig. H] (c) 
and [3] (d,e)). The change in pace of average de- 
gree after 1994, especially in the applied network, 
is consistent with the surge in collaboration ob- 
served above. Meanwhile, the largest component 



of the aggregate network absorbed greater propor- 
tions of authors, from 37% (1989) to 65% (2009), 
though this trend decelerated. These proportions 
span the typical range for collaboration networks
[Gro02, New01b, BJN+02, TL07, Per10], suggesting that
the proportional rise will continue to
decelerate as the networks approach a practical 
upper limit on collaboration network cohesion. 
The pure and applied subnetworks conglomerated 
similarly, though they exhibited different mean 
separation, with the applied network consistently 
more dispersed (Fig. 3 (f)). Generally,
$\langle \ell \rangle$ decreases as $D$ increases, and
while residuals from linear fits of these metrics
exhibited some correlation ($r = -.63$), the pure
network was consistently tighter despite the greater
density of the
applied. This implies that the structure of col- 
laboration varies in important ways, among disci- 
plines and over time, and we studied this structure 
through coauthor correlations and clustering. 



4.2. Correlations among collaborators

A network is assortative, or exhibits assortative
mixing, when similar pairs of nodes are preferentially
linked, disassortative when linking is preferentially
dissimilar, and nonassortative otherwise [New03]. The
normalized degree correlation coefficient
$r_{\mathrm{col}}$ measures assortative mixing by number
of collaborators. We supplemented $r_{\mathrm{col}}$
with a measure $r_{\mathrm{pub}}$ of assortative mixing
by number of publications. (See the supporting
information for a formal definition.)
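A degree correlation coefficient of the kind described above is commonly computed as the Pearson correlation between the values found at the two ends of each link, with every link counted in both directions. The Python sketch below follows that standard construction. Passing publication counts instead of degrees gives one plausible analog for mixing by number of publications, but that is an assumption made for illustration: the formal definition is given in the supporting information and is not reproduced here.

```python
from collections import Counter


def degrees(edges):
    """Number of distinct coauthors of each node in an edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def assortativity(edges, value):
    """Pearson correlation of `value` across the two ends of each link.

    Each undirected link contributes both (u, v) and (v, u), so the
    two marginal distributions are identical.
    """
    xs, ys = [], []
    for u, v in edges:
        xs += [value[u], value[v]]
        ys += [value[v], value[u]]
    mean = sum(xs) / len(xs)
    covariance = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    variance = sum((x - mean) ** 2 for x in xs)
    # Undefined when every link end carries the same value.
    return covariance / variance if variance else float("nan")


# Toy example: a path a - b - c - d, where low-degree ends attach to
# higher-degree interior nodes.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
r_col = assortativity(edges, degrees(edges))

# Hypothetical publication counts, used in the same way.
publication_counts = {"a": 6, "b": 5, "c": 1, "d": 2}
r_pub_like = assortativity(edges, publication_counts)
```

A negative value, as for the path above, indicates disassortative mixing; positive values indicate that similar authors are preferentially linked.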

Collaboration networks are known to be assortative by
collaborators ($.1 < r_{\mathrm{col}} < .4$) but
previous studies indicate that mathematics networks are
less so [TL07, New03]. We also found $r_{\mathrm{col}}$
to be positive but low in the aggregate, pure, and
applied networks, though some subdisciplines were
largely nonassortative (Fig. 4 (a,d)). Mathematics
researchers were more strongly correlated by publishing
rate ($.3 < r_{\mathrm{pub}} < .6$). The applied network
and subnetworks exhibited stronger correlations by both
metrics, signifying more hierarchical organization.

The events come into sharper focus through 
these correlation coefficients. Around 1995 the 
network shifted from progressively disassorta- 
tive mixing to progressively assortative, mostly 
with respect to collaborators and predominantly 
among applied researchers. After 2001 this trend 
reversed again as coauthors became less corre- 
lated with respect both to collaborators and to 
publications, and in the latter case earlier in the 
pure network. 



Figure 3: Across 5-year sliding windows, 1985-9 to 2005-9: Average number of (a) publications per author, (b) collaborative
publications per author, and (c) publications per pair (zeros omitted from each average). (d) Average degree $\langle k \rangle$. (e)
Residuals of $\langle k \rangle$ from best linear fits. (f) Harmonic average separation. (g-i) Estimates of $\alpha_1$, $\alpha_1 + \alpha_3$, and $\alpha_3$ in the
regression model (1).



4.3. Scale-freeness

Recently the graph-theoretic statistic
$s(g) = \sum k_i k_j$, a sum taken over edges $v_i v_j$
of graph $g$, has been used to quantify "scale-freeness"
among graphs with a common (scaling) degree sequence
[LADW05]. This s-metric is greatest where high-degree
nodes are linked preferentially, producing a highly
connected "hub-like" core. The metric

$$S(g) = \frac{s(g) - s_{\min}}{s_{\max} - s_{\min}}$$

(refined in [Li07]) normalizes $s$ over the range of
s-values across graphs of the same degree sequence as
$g$, and therefore has range [0, 1]. $S$ may also be
interpreted as a similar normalization of
$r_{\mathrm{col}}$ across this collection of graphs.
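The unnormalized s-metric is simple to compute from an edge list, as the Python sketch below shows. The two toy graphs share one degree sequence but differ in how their degree-2 nodes are linked, so they illustrate how the statistic separates graphs that a degree distribution alone cannot. Finding the extreme s-values over all graphs with a given degree sequence, which the normalized metric requires, is a harder problem and is not attempted here; the function name is an assumption for the example.

```python
from collections import Counter


def s_metric(edges):
    """Sum over links of the product of the degrees of their endpoints."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(degree[u] * degree[v] for u, v in edges)


# Two toy graphs with the same degree sequence (2, 2, 2, 1, 1).
# In the path, each end node attaches to a degree-2 node.
path = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
# Here the three degree-2 nodes link only to one another.
triangle_plus_edge = [("b", "c"), ("c", "d"), ("b", "d"), ("a", "e")]

s_path = s_metric(path)
s_core = s_metric(triangle_plus_edge)
```

The second graph, in which the higher-degree nodes are linked to each other, yields the larger value, consistent with the description of the s-metric as greatest where high-degree nodes are linked preferentially.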

The S-metric is best understood across graphs with a
power law degree sequence, and while the degree
sequences of collaboration networks are not
well-modeled by power laws [LADW05, New01b, Gro02],
power-law approximations are popular [New01c] and
helpful in distinguishing collaboration networks from
other categories of networks [ASBS00]. Recent studies
apply $S$ to several model networks [LADW05, THL05,
THLH07, BC08] but applications to social networks are
limited [Hsi09]. We observed $.48 < S < .58$. The time
series for $S$ reveals that fluctuations in
$r_{\mathrm{col}}$ may be interpreted in the context of
gradually diminishing scale-freeness (Fig. 4 (c)).

4.4. Clustering

Whereas $\langle k \rangle$, $r_{\mathrm{col}}$,
$r_{\mathrm{pub}}$, and $S$ measure individual and
pairwise structure, clustering measures structure
concerning triples. Among triples of authors $a$, $b$,
and $c$ where $a$ and $b$ collaborated and $a$ and $c$
collaborated, the (global) clustering coefficient $C$
expresses the proportion for whom $b$ and $c$ also
collaborated. Disassortative graphs permit a reduced
number of possible triangles among nodes of different
degrees, and thus admit a smaller range of values for
$C$. This may be globally accounted for using relative
probabilities [New01a], which we consider in the
supplementary text, but we principally adopted a
clustering coefficient $\tilde{C}$ designed to correct
for this locally [SV05]. The aggregate network showed
low clustering, $.22 < C < .27$, compared to other
coauthorship graphs [New01b, BJN+02, TL07, Per10]. The
correction doubled the range to $.46 < \tilde{C} < .53$,
as observed in other networks [SV05].

We also introduced an exclusive clustering coefficient:
Among triples of authors where $a$ and $b$ collaborated
without $c$, $a$ and $c$ collaborated without $b$, and
both $b$ and $c$ published at least twice, $C_x$ is the
proportion for whom $b$ and $c$ collaborated without
$a$. $C_x$ detects changes in coauthorship that cannot
be explained by team collaboration, and distinct
pairwise publications suggest stronger, transitive
relationships than single common publications.‡

‡While clustering coefficients have been introduced
for bipartite author-publication graphs [RA04, ZWL+08],
they do not address this issue directly.

Figure 4: Across 5-year sliding windows, 1985-9 to 2005-9: (a) Assortative mixing by number of coauthors, $r_{\mathrm{col}}$. (b)
Assortative mixing by number of publications, $r_{\mathrm{pub}}$. (c) The S-metric. (d) $r_{\mathrm{col}}$ across subdisciplines. (e) $r_{\mathrm{pub}}$ across
subdisciplines. (f) $S$ across subdisciplines. (g) Global clustering $\tilde{C}$ corrected for degree assortativity. (h) Global clustering
$C_x$ based on pairwise exclusive publications. (i) Average local clustering corrected for degree assortativity across authors
of fixed degree 4, 7, and 10.

Locally, the clustering coefficient $c_i$ of an author
$v_i$ is the proportion of the $k_i(k_i - 1)/2$ pairs of
their collaborators who have themselves collaborated.
Again we adopted the correction $\tilde{c}_i$ from
[SV05]. We used the network-wide average
$\langle \tilde{c} \rangle$ in our change point analysis
(Table 2), and we stratified authors by degree in
Fig. 4 (i) to compare clustering across
differently-connected researchers.
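The uncorrected global and local clustering coefficients defined in this subsection can be sketched directly from their definitions. The Python below counts, for each author, the pairs of collaborators and the pairs that have themselves collaborated. It deliberately omits both the degree-correlation correction of [SV05] and the exclusive coefficient, which need information beyond the plain graph; the adjacency-dictionary representation and function names are assumptions made for the example.

```python
from itertools import combinations


def global_clustering(adj):
    """Proportion of connected triples (b - a - c) that are closed."""
    triples = 0
    closed = 0
    for a, neighbors in adj.items():
        for b, c in combinations(sorted(neighbors), 2):
            triples += 1
            if c in adj[b]:
                closed += 1
    return closed / triples if triples else 0.0


def local_clustering(adj, v):
    """Proportion of the k(k-1)/2 pairs of v's neighbors that are linked."""
    k = len(adj[v])
    if k < 2:
        return 0.0  # convention chosen here for nodes with fewer than 2 links
    linked = sum(1 for b, c in combinations(sorted(adj[v]), 2) if c in adj[b])
    return linked / (k * (k - 1) / 2)


# Toy example: a triangle a, b, c with a pendant node d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
C = global_clustering(adj)
c_a = local_clustering(adj, "a")
c_b = local_clustering(adj, "b")
```

In the toy graph, author a has three pairs of collaborators of which one pair collaborated, while author b's single pair of collaborators did, so the better-connected author has the lower local coefficient. This dependence on degree is the reason for stratifying authors by degree when comparing local clustering.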

We found patterns of clustering to reaffirm that the
two events were driven by larger teams of
collaborators. While $C_x < C < \tilde{C}$ by
definition, in our network $C_x$ was comparatively
tiny, with $.0041 < C_x < .0051$ (Fig. 4 (g,h)). This
indicates that highly collaborative projects drove
overall clustering behavior. Clustering increased after
1995, at both local and global scales and by both
graph-theoretic and exclusive definitions. In
particular, better-connected authors exhibited greater
clustering earlier than less-connected authors. After
2001, however, graph-theoretic clustering surged while
exclusive clustering plummeted (Fig. 4 (g-i)). After
2001 the pure and applied networks grew increasingly
dissimilar, with greater graph-theoretic clustering in
the applied but greater exclusive clustering in the
pure. This suggests a connection to the more prominent
disassortative mixing in the applied network, and
indeed the propagation of highly collaborative projects
by disassortative short-lived research teams would
explain both dissimilarities. Furthermore, $C_x$
mimicked collaboration weight $\langle w \rangle$ and
$r_{\mathrm{pub}}$, suggesting that autonomous
collaborations are better forged among similarly
prolific authors.



5. Events and Change Point Models 

While long-term trends varied widely, fluctuations in
our metrics, as revealed by residuals from linear fits,
were often highly correlated (Fig. 5). Similar
fluctuations suggest mathematical or sociological
dependencies among properties; we grouped together
metrics with strongly correlated time series and
identify these groups by symbol in Table 2 and
Fig. 6 (c,d). We used a change point model§ to arrange
these shifts chronologically.

Our change point model fits a continuous,
piecewise-linear curve with one corner to a set of
ordered pairs $(x_i, y_i)$, subject to error from a
fixed distribution, analogously to a linear fit. The
model takes the slopes, intercept, and change point to
be unknown and the errors to come from a normal
distribution with unknown variance:

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 (x_i - c)\,\delta_{x_i > c} + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma).$$

§A recent bibliography of change point problems by
Khodadadi and Asgharian [KA08] traces change point
models to a 1954 discussion by Page [Pag54] on piecewise
continuous models.








The indicator $\delta_{x > c}$ takes the value 1 when
$x > c$ and 0 otherwise. The parameters
$\beta_0, \beta_1, \beta_2, c$ encode the two slopes
$\beta_1$ and $\beta_1 + \beta_2$, the $y$-intercept
$\beta_0$, and the change point $c$. Our code in R uses
iterative methods to find estimators for the parameters
that minimize $\mathrm{SSE} = \sum_i \epsilon_i^2$. We
optimized this model over intervals visually centered
about the dramatic shift of each time series to obtain
the dates in Table 2. We used intervals of 11 years
when possible, shortening to 10 years in case our
algorithm failed, for consistency with sliding windows.
We exhibit code and all change point fits to time series
in the supplementary materials.

Figure 5: Correlation matrix of time series of key statistics
across 5-year sliding windows. Top/right to bottom/left:
p, no. publications; k, $\langle k \rangle$; a, $\langle a \rangle$; n, $n$; m, $m$; C, $\tilde{C}$; c, $\langle \tilde{c} \rangle$;
Ss, avg. no. subject classifications per author; s, $\langle s \rangle$; w,
$\langle w \rangle$; X, $C_x$; rp, $r_{\mathrm{pub}}$; z, avg. no. publications per author;
Z, avg. no. collab. publications per coauthor; K, $\langle \kappa \rangle$; S,
$S$; rc, $r_{\mathrm{col}}$; a3, $\alpha_3$.

Figure 6: Across 5-year sliding windows, 1985-9 to 2005-9:
(a) Residuals from best linear fit of the number of
publications, with change point fits about the mid-90s event
overlaid. (b) Residuals from best linear fit of the number
of authors, with change point fit about the mid-2000s
event overlaid. (c-d) Differences in best-fit change points
from the aggregate network to the few-author network at
both events.

Table 2: Change points in time series around both events.

statistic (symbol groups as in Fig. 6)                          change point

Mid-90s event
avg. collab. publications/coauthor                              1992.13
correlation by no. coauthors ($r_{\mathrm{col}}$)               1993.13
global SV-clustering coeff. ($\tilde{C}$)                       1993.17
scale-freeness ($S$)                                            1993.49
avg. no. coauthors ($\langle k \rangle$)                        1994.13
diff. in collab.-multidisc. effects ($\alpha_3$)                1994.17
avg. publications/author                                        1994.30
avg. no. collaborations ($\langle \kappa \rangle$)              1995.04
avg. SV-clustering coeff. ($\langle \tilde{c} \rangle$)         1995.26
avg. authors/publication ($\langle a \rangle$)                  1995.65
no. publications                                                1996.16
no. collab. pairs ($m$)                                         1996.17
no. authors ($n$)                                               1996.19
correlation by no. publications ($r_{\mathrm{pub}}$)            1996.30

Early-2000s event
correlation by no. coauthors ($r_{\mathrm{col}}$)               1997.37
diff. in collab.-multidisc. effects ($\alpha_3$)                1998.35
correlation by no. publications ($r_{\mathrm{pub}}$)            2000.76
global exclusive clustering coeff. ($C_x$)                      2001.08
avg. collab. publications/coauthor                              2001.65
global SV-clustering coeff. ($\tilde{C}$)                       2001.74
avg. publications/author                                        2001.74
avg. collab. weight ($\langle w \rangle$)                       2002.06
avg. SV-clustering coeff. ($\langle \tilde{c} \rangle$)         2002.22
scale-freeness ($S$)                                            2002.40
avg. authors/publication ($\langle a \rangle$)                  2002.51
avg. no. collaborations ($\langle \kappa \rangle$)              2002.59
no. collab. pairs ($m$)                                         2002.72
no. authors ($n$)                                               2003.65
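The change point model of this section, a continuous piecewise-linear curve with one corner, can be fit in several ways. The Python sketch below substitutes a simple grid search for the iterative methods of the authors' R code: for each candidate change point the three remaining parameters enter linearly, so they are found by ordinary least squares, and the candidate with the smallest sum of squared errors is kept. All function names, the grid resolution, and the toy series are assumptions for illustration. For calendar years it is advisable to shift the x-values (for example, to years since the start of the window) before fitting, to keep the normal equations well conditioned.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_given_change_point(xs, ys, c):
    """Least squares for y = b0 + b1*x + b2*max(x - c, 0) at fixed c."""
    rows = [(1.0, x, max(x - c, 0.0)) for x in xs]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    beta = solve_linear_system(A, b)
    sse = sum((y - sum(bi * ri for bi, ri in zip(beta, r))) ** 2
              for r, y in zip(rows, ys))
    return beta, sse


def fit_change_point(xs, ys, steps=200):
    """Grid search over c strictly between min(xs) and max(xs)."""
    low, high = min(xs), max(xs)
    best = None
    for k in range(1, steps):
        c = low + (high - low) * k / steps
        beta, sse = fit_given_change_point(xs, ys, c)
        if best is None or sse < best[2]:
            best = (c, beta, sse)
    return best


# Hypothetical noise-free series over an 11-point window with a corner
# at x = 4: slope 0.5 before the corner and 0.5 - 1.0 = -0.5 after it.
xs = [float(x) for x in range(11)]
ys = [1.0 + 0.5 * x - 1.0 * max(x - 4.0, 0.0) for x in xs]
c_hat, beta_hat, sse_hat = fit_change_point(xs, ys)
```

A grid search is slower than an iterative optimizer but cannot stall in a flat region of the objective, which makes it a convenient cross-check on short windows like the 10- and 11-year intervals described above.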



We also contrasted the aggregate network with 
a "few-author" network constructed from publi- 
cations of 6 authors or fewer, which would be 
unaffected by the reversals of trends exhibited in 
Fig. m (c). For uniformity in our change point 
analysis we drew all statistics from largest com- 
ponents, so time series for many statistics differ 
from those presented earlier. (In Table 2 the
multidegree of a node $v_i$ in a weighted graph is the
sum $\kappa_i$ of the weights of its links; we then say
that author $v_i$ has engaged in $\kappa_i$
"collaborations".) The
few-author network exhibited fluctuations similar 
to, but not always simultaneous with, those of the 
aggregate. By several metrics it experienced the 
first event later than the aggregate but the second 
event at essentially the same time (Fig. 6 (c,d)).
This suggests that highly collaborative projects 
were inceptive to the first event while not neces- 
sarily to the second. 

The chronologies of both events, as arranged 
in Table 2, suggest "top-down" narratives, with
shifts in hierarchical metrics sensitive to highly 
central or prolific authors preceding shifts in met- 
rics of local connectivity, and shifts in network- 
wide averages and totals manifesting last. 



6. Discussion 

Over 25 years the mathematics collaboration 
network grew steadily larger, more collaborative, 
and better-connected both locally and globally. 
While the applied network was better connected locally
($\langle a \rangle$, $\langle \kappa \rangle$,
$\langle \tilde{c} \rangle$) and exhibited more
hierarchical structure ($r_{\mathrm{col}}$,
$r_{\mathrm{pub}}$, $S$), the pure network was better
connected globally ($\langle \ell \rangle$) and
exhibited stronger local connections
($\langle w \rangle$, $\langle s \rangle$, $\alpha_3$).
In particular, while the small-world properties of low
mean separation and high clustering have been
reproduced together by a variety of real-world and
model networks JASBSOdj JWS98I . I.TacOSl iBMinj . 
neither of our major subnetworks is clearly the 
superior "small world" of the two. 

The mid-90s event was characterized by proliferated and
strengthened collaboration (Fig. 2 (e),
$\langle k \rangle$; Fig. 4 (g,h)), a weakening
relationship between collaboration and
multidisciplinarity (Fig. 3 (g-i)), and moderately
increased assortative mixing (Fig. 4 (a,c)). The rise
in several-author publications explains the
stabilization of clustering; exclusive clustering had
already been rising (Fig. 4 (g,h)). However, increases
in clustering and hierarchical metrics were still
evident in the network constructed from 6- or
fewer-author publications. The delay in shifts from the
aggregate to this few-author network indicates that
highly collaborative projects were inceptive to the
event (Fig. 6 (c)), a proposition supported by the
"top-down" progression of change points.

These qualities of the event, the similar be- 
havior of the pure and applied disciplines, and 
timing suggest a possible factor: the rise of 
e-communications and the World Wide Web. 
Among academic Internet milestones are the
introductions of the arXiv in 1991, which went online
in 1993 [Gin09], and of MathSciNet in 1996, which made
the MR publishing database available through a
graphical web interface [Jac97]. We should expect
researchers in more applied subdisciplines, who
historically made greater use of computing resources,
to have made quicker use of these tools, and indeed the
applied network and its subdisciplines exhibited the
above trends more clearly (Fig. 4 (a-f)).

The early-2000s event tells a dissimilar story. 
This event was characterized by weakening aver- 
age publishing rates and collaboration strength 
(Fig. 3 (a-c) and 4 (h)) due in part to an influx of
less prolific authors (Fig. 1 (d) and 2 (d)) and
dramatically disassortative mixing (Fig. 4 (a-c)).
While disassortativity was ubiquitous, lower publishing
rates were more evident in applied disciplines.
Increased clustering was largely explained by a further
acceleration in several-author publications
(Fig. 4 (g,h)). Highly collaborative projects were not
so inceptive (Fig. 6 (d)).

The growth in the research community and in- 
terconnections within it, simultaneous with weak- 
ening average publishing rates and collaboration 
rates, may be largely explained by the surge in 
transient authors. This surge may reflect an in- 
creasing trend toward interdisciplinary research 
involving many researchers outside mathematics 
who publish seldom but in larger teams. This 
is consistent with the absence of a specializa- 
tion trend during this event, which distinguishes 
it from the earlier event (Fig. 2 (f)). A possible
contributing factor to such a trend would have been an
increased emphasis on interdisciplinary projects at
funding agencies such as the National Science
Foundation, the largest funder of U.S. mathematics
research. We note that the event was concurrent with
increased funding by the NSF for its Division of
Mathematical Sciences [nsf], recommended by a 1998
report [Odo98]. (See the supplementary text for
detailed discussion.) A change point fit to 5-year
funding averages places the surge at 2001.33, toward
the beginning of the event (Table 2). NSF funding
affects almost exclusively U.S.-based research,
however, while the MR database covers worldwide
output.

7. Conclusions 

The community of researchers in mathemat- 
ical sciences has grown at an increasing rate 
since 1985, and their research output has accel- 
erated. Amidst this growth the literature has be- 
come increasingly multidisciplinary and the net- 
work of researchers has grown better-connected 
and individual researchers more collaborative. In- 
creased collaboration has been due in large part 
to highly collaborative teams of researchers, many 
of whose members have short mathematical publishing
histories. Such disassortative authorship
has been more prevalent in applied disciplines, 
which nonetheless exhibit more hierarchical or- 
ganization, while researchers in more pure disci- 
plines maintain longer collaborations and are less 
separated by degrees of coauthorship. The net- 
work drastically reorganized twice between 1985 
and 2009, in different ways that suggest dissimilar 
causes and consequences. 

The MR network is huge and admits much more 
analysis than we have performed. Data collected 
since 1940 are being processed and will be released 
soon, which will allow investigators to treat the 
database from conception. We omitted discus- 
sion of linking mechanisms, and of a range of tools 
for detecting community structure, for which the
MR network holds great potential. We suggested
possible partial explanations for the two events 
we described, but it is beyond the scope of this 
paper to consider these hypotheses thoroughly. 
More detailed information on mathematical pub- 
lishing and its funding may be obtained from the 
MR database and from government agencies, and 
follow-up investigations may provide deeper in- 
sights into these possible connections. 

8. Supporting Information

We performed calculations in R, including 
graph-theoretic calculations using the igraph 
package and some original code. All code is avail- 
able upon request from the first author. 

8.1. Publishing rates and connectivity 

As discussed in the main text, one-time authors
increasingly worked in large research teams
(Fig. [S1](a)). They also comprised an increasing
proportion of the community after 2000
(Fig. [S1](d)), while the proportion consisting of
more prolific authors (3 or more publications)
decreased (Fig. [S1](c)).

Network density (Fig. [S1](b)) is directly related
to average degree. The relevance of the
pure-applied split is evident in the higher density
of both subnetworks, indicating stronger connectivity among
pure researchers and applied researchers separately
than among all mathematics researchers.
The fluctuations, especially in the applied network,
were similar to those in $r_{\mathrm{col}}$. The aggregate,
pure, and applied networks exhibited similar
cohesion with respect to largest connected components
(Fig. [S1](c)), and the "S"-shape of the
curves (especially that of the applied) suggests
that this proportion is approaching its practical
limit.
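
The relation between density and average degree can be made explicit. The following is a short worked identity, assuming the standard definition of density for a simple undirected graph with $n$ nodes and $m$ links (the text does not state the definition, so the symbols $d$ and $\bar{k}$ are introduced here for illustration only):

```latex
d = \frac{2m}{n(n-1)}, \qquad
\bar{k} = \frac{2m}{n}
\quad\Longrightarrow\quad
d = \frac{\bar{k}}{n-1}.
```

Under this definition, density falls whenever the number of authors grows faster than the average number of coauthors per author.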



8.2. Assortative mixing by publications 



Newman [New03] defines a normalized degree
correlation coefficient by way of "remaining degree":
Starting with a pair of nodes $(v_i, v_j)$, take
the number of neighbors of each excluding the
other, $(k_i - 1, k_j - 1)$. These are their remaining
degrees. We define $r_{\mathrm{pub}}$ analogously using the
notion of remaining prolificity.

Consider authors $v_i$ and $v_j$ who have authored
$z_i$ and $z_j$ publications, respectively, and have
collaborated on $w_{ij}$ of them. Then $z_i$ is the
"prolificity" of $v_i$ (and $z_j$ that of $v_j$), while $w_{ij}$ is the
"collaboration weight" of $v_i$ and $v_j$ together. Define
the remaining prolificity of $v_i$ with respect to
$v_j$ to be $z_i - w_{ij}$, the number of publications by $v_i$
not coauthored with $v_j$. Since in graph-theoretic
language we say that $v_i$ is adjacent to the link
$(v_i, v_j)$, we refer to the remaining prolificity of
the adjacency of node $v_i$ to link $(v_i, v_j)$.

Where the network includes $n_x$ authors of prolificity
$x$, set $p_x = n_x / \sum_{x'} n_{x'}$, the proportion
of nodes in the network of prolificity $x$. Now
consider the adjacencies: They number twice as
many as the links. If we let $p_{x,w}$ be the
proportion of adjacencies with author (node) prolificity
$x$ and collaboration (link) weight $w$, then
$q_r = \sum_{w \geq 1} p_{w+r,w}$ is the proportion of all
adjacencies having remaining prolificity $r$.

Figure S1: Across 5-year sliding windows: (a) Average number of authors on publications by one-time authors. One-time
authors appeared on increasingly collaborative publications throughout our interval. (b) Network density. The applied
network grew dramatically sparser after 1995 while the pure grew denser until 2006. (c) Proportion of nodes in the largest
component. There is no difference between the pure and applied subnetworks to be discerned from the proportions of
authors comprising their largest components, and the proportion comprising the largest component of the aggregate is
consistently larger than both. (d-f) Proportions of authors with 1, 2, and 3 publications.

Let us have ordered pairs $(r, s)$ range over the
remaining prolificities of linked nodes, so that
each link is counted twice (as $(r, s)$ and as $(s, r)$).
Our statistic of interest is then

$$r_{\mathrm{pub}} = \frac{E(rs) - E(r)E(s)}{\sqrt{\mathrm{Var}(r)\,\mathrm{Var}(s)}},$$

the correlation coefficient for the remaining
prolificities of linked nodes.

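Define $e_{r,s}$ to be the joint probability distribution
of the remaining prolificities at the ends of
a uniformly randomly chosen link. Since $r$ and
$s$ are drawn from the same distribution, we may
simplify the numerator as

$$E(rs) - E(r)E(s) = E(rs) - E(r)^2 = \sum_r \sum_s r s\, e_{r,s} - \Big(\sum_r r\, q_r\Big)^2$$

and the denominator as

$$\sqrt{\mathrm{Var}(r)\,\mathrm{Var}(s)} = \mathrm{Var}(r) = E(r^2) - E(r)^2 = \sum_r r^2 q_r - \Big(\sum_r r\, q_r\Big)^2.$$

If we index the links by $i = 1, \ldots, m$ and (arbitrarily)
label the remaining prolificities of their
ends $r_i$ and $s_i$ then we may rewrite

$$\sum_r \sum_s r s\, e_{r,s} = \frac{1}{m} \sum_i r_i s_i, \qquad
\sum_r r\, q_r = \frac{1}{m} \sum_i \tfrac{1}{2}(r_i + s_i), \qquad
\sum_r r^2 q_r = \frac{1}{m} \sum_i \tfrac{1}{2}(r_i^2 + s_i^2).$$

This provides the computational formula

$$r_{\mathrm{pub}} = \frac{m^{-1} \sum_i r_i s_i - \left[ m^{-1} \sum_i \tfrac{1}{2}(r_i + s_i) \right]^2}{m^{-1} \sum_i \tfrac{1}{2}(r_i^2 + s_i^2) - \left[ m^{-1} \sum_i \tfrac{1}{2}(r_i + s_i) \right]^2}.$$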

If the remaining prolificities of linked nodes are
independent then $e_{r,s} = q_r q_s$. If, instead, linked
pairs are perfectly correlated in this respect then
we get $e_{r,s} = q_r \delta_{r,s}$, where $\delta_{r,s}$ is the Kronecker
delta (1 if $r = s$, 0 otherwise). The authors of
a collaboration network are perfectly correlated
by remaining prolificities $r$ precisely when they
are perfectly correlated by prolificity $x$, that is,
when the network consists of connected components
of uniform prolificity.
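
The computational formula for $r_{\mathrm{pub}}$ can be sketched in a few lines of base R. This is a minimal illustration, not the authors' code: the function name `r_pub`, the column names `a`, `b`, `w`, and the toy network are all assumptions made for the example.

```r
# Sketch: publication assortativity r_pub from a weighted coauthorship edge list.
# All names (r_pub, links, z) are illustrative, not taken from the authors' code.
r_pub <- function(links, z) {
  # links: data frame with columns a, b (author ids) and w (joint publications)
  # z: named numeric vector of prolificities (total publications per author)
  r <- z[as.character(links$a)] - links$w  # remaining prolificity at end a
  s <- z[as.character(links$b)] - links$w  # remaining prolificity at end b
  m <- nrow(links)
  mean_rs <- sum(r * s) / m                # E(rs)
  mean_r  <- sum(r + s) / (2 * m)          # E(r) = E(s)
  mean_r2 <- sum(r^2 + s^2) / (2 * m)      # E(r^2)
  unname((mean_rs - mean_r^2) / (mean_r2 - mean_r^2))
}

# Toy network (made-up data): four authors, three links.
z <- c(A = 3, B = 3, C = 5, D = 2)
links <- data.frame(a = c("A", "A", "C"),
                    b = c("B", "C", "D"),
                    w = c(2, 1, 1))
r_pub(links, z)  # approximately -0.2
```

The result equals the Pearson correlation of the remaining prolificities taken over both orientations of every link, which is a convenient cross-check on small examples.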

8.3. Assortative mixing with low-count authors 
removed 

To check that the trends and fluctuations we
observed in $r_{\mathrm{col}}$ and in $r_{\mathrm{pub}}$ were not artifacts of
the mixing behavior of authors with only one collaborator
or publication, we ran the calculations
on the aggregate with such authors removed from
consideration. The overall trends were the same
(Fig. [S2]).

8.4. Clustering coefficients 

The time series for C (Fig. [S3](a)) and $r_{\mathrm{col}}$ are
similar. The dependence between these statistics
reflects the reduced number of possible triangles
among nodes of different degrees, which admits
less clustering in disassortative graphs [SV05]. In
the main paper we accounted for this interaction
using the correction C introduced by Soffer and
Vazquez [SV05].

Figure S2: Across 5-year sliding windows on the aggregate
network: (a) $r_{\mathrm{col}}$ calculated after removing authors with
only one collaborator from consideration. (b) $r_{\mathrm{pub}}$ calculated
after removing authors with only one publication
from consideration.

Figure S3: Across 5-year sliding windows: (a) Global clustering
coefficient C: the proportion of connected triples
of authors who are in fact pairwise linked. The time series
closely resembles that of degree assortativity, particularly
before 2001. (b) Clustering normalized by density
(Fig. [S1](b)), its expected value in a uniformly random
graph: $C \cdot n(n-1)/2m$, the relative probability that two
authors collaborated provided they had a common coauthor.
The rises in density and in assortative mixing from
1995 to 2000 were largely to credit for the perceived rise in
clustering during this period, while clustering after 2001
becomes more pronounced when corrected for these phenomena.
Plots (b) and (c) use information from the entire
graph.

Under uniformly random linking, a higher proportion
of connected triples will form triangles in
a denser graph, increasing clustering. To account
for this, we normalized C by density (Fig. [S3](d))
to get the relative probability that two authors
have collaborated provided that they have a common
coauthor [New01a] (Fig. [S3](c)). Because
density decreased substantially due to proportional
growth in the largest component, this normalization
increased monotonically across largest
components; we took the normalization over entire
graphs instead.
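
The density normalization described above can be illustrated in base R. This is a sketch under stated assumptions: the function name `normalized_clustering` and the toy adjacency matrix are hypothetical, and the authors' own calculations used the igraph package rather than this matrix-based approach.

```r
# Sketch: global clustering C and its density-normalized version, base R only.
# The function name and the toy graph are illustrative assumptions.
normalized_clustering <- function(A) {
  # A: symmetric 0/1 adjacency matrix with zero diagonal
  n <- nrow(A)
  m <- sum(A) / 2                             # number of links
  deg <- rowSums(A)
  triples <- sum(choose(deg, 2))              # connected triples (paths of length 2)
  triangles <- sum(diag(A %*% A %*% A)) / 6   # each triangle is counted 6 times
  C <- 3 * triangles / triples                # global clustering coefficient
  density <- 2 * m / (n * (n - 1))
  c(C = C, density = density, normalized = C / density)
}

# Toy graph: a triangle 1-2-3 with a pendant node 4 attached to node 3.
A <- matrix(0, 4, 4)
A[cbind(c(1, 1, 2, 3), c(2, 3, 3, 4))] <- 1
A <- A + t(A)
normalized_clustering(A)
```

Here `normalized` equals $C \cdot n(n-1)/2m$; a value near 1 indicates no more clustering than expected under uniformly random linking at the same density.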

8.5. Change point fits

We fit change point models to time series data
that were not clearly piecewise linear, but that
exhibited one or two major changes in behavior
amid smaller perturbations that the models interpret
as normally distributed error. Here we discuss
change point models in more detail, and we
display all aggregate residual plots, together with
change point fits, used for the analysis.

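The principle behind change point models is the
same as that behind linear models. A traditional
linear fit takes the form

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma),$$

while our change point model takes the form

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 (x_i - c)\,\delta_{x_i > c} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma),$$

where $\delta_{x_i > c}$ equals 1 when $x_i > c$ and 0 otherwise.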
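
For a fixed change point the model above is linear in its coefficients, which suggests a simple way to explore it. The following base R sketch fits the model by ordinary least squares over a grid of candidate change points. It is an illustrative alternative, not the nonlinear least squares procedure used for the analysis; the function name `fit_changepoint_grid` and the synthetic series are assumptions.

```r
# Sketch: for a fixed change point c0 the model is linear in its coefficients,
# so it can be fit with lm() using a hinge term; c0 is then chosen on a grid.
# This grid search is an illustrative alternative, not the authors' nls() code.
fit_changepoint_grid <- function(x, y, candidates) {
  rss <- sapply(candidates, function(c0) {
    hinge <- pmax(x - c0, 0)                # (x - c) times the indicator of x > c
    sum(resid(lm(y ~ x + hinge))^2)         # residual sum of squares
  })
  best <- candidates[which.min(rss)]
  hinge <- pmax(x - best, 0)
  list(c = best, coef = unname(coef(lm(y ~ x + hinge))))
}

# Synthetic series (not the paper's data): slope 0.5 before 2001, 2.0 after.
set.seed(1)
x <- 1990:2009
y <- 10 + 0.5 * (x - 1990) + 1.5 * pmax(x - 2001, 0) +
  rnorm(length(x), sd = 0.1)
fit <- fit_changepoint_grid(x, y, candidates = seq(1995, 2005, by = 0.25))
fit$c     # estimated change point
fit$coef  # estimates of beta0, beta1, beta2
```

A grid search needs no starting guess for $c$, at the cost of resolution limited by the grid spacing.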

We caution that some basic statistical assumptions
for change point models are not met by these
data. In particular, because adjacent sliding windows
share 4 out of their 5 years, and also because
most authors publish in multiple years, measure- 
ments performed on these windows cannot be con- 
sidered independent. Because each window con- 
tains a different (increasing) number of publica- 
tions, they cannot be considered identically dis- 
tributed. By performing change point analysis 
we do not intend to make predictions of future 
behavior but only to take advantage of an effec- 
tive method for identifying shifts in behavior oth- 
erwise well-modeled linearly. 

What follows is a simplification of the code we
used to perform change point analysis in R. We
required a guess $c$ at the change point and calculated
estimators for the coefficients by fitting
a linear model lm1 to the data below $c$ (providing
$\beta_0$ and $\beta_1$) and a linear model lm2 with fixed
intercept at $(c, \mathrm{lm1}(c))$ to the data above $c$ (providing
$\beta_2$).

    # FUNCTION: Change point analysis on a
    # collection of ordered pairs
    changepoint.model <- function(
      x,  # points (independent), sorted
      y,  # values (dependent)
      c   # number in range(x): initial guess at the change point
    ) {
      len <- length(x)
      stopifnot(len == length(y))

      # Linear model on the data below c, to estimate b0 and b1
      m <- max(which(x < c))
      lm1 <- lm(y[1:m] ~ x[1:m])
      b0 <- unname(coef(lm1)[1])
      b1 <- unname(coef(lm1)[2])

      # Scaling model on the data above c, to estimate b2
      # y-value of lm1 at x = c
      int <- b0 + b1 * c
      # x-values with origin (c, int)
      x2 <- x[(m + 1):len] - c
      # y-values with origin (c, int)
      y2 <- y[(m + 1):len] - int
      # Regression through the origin (c, int): the slope is b1 + b2
      lm2 <- lm(y2 ~ x2 + 0)
      b2 <- unname(coef(lm2)[1]) - b1

      # Change point model using estimators for
      # c (given), b0, b1, and b2 as starting values
      return(summary(nls(
        y ~ B0 + B1 * x + B2 * (x - C) * (x >= C),
        start = list(
          C = c, B0 = b0, B1 = b1, B2 = b2
        )
      )))
    }



Figs. [S4]-[S7] depict the change point fits we used
to examine the two events in the main paper, with 
the exception of Fig. [S4lfQ): the fit, we judged, 
was too poor to warrant inclusion, and it demon- 
strates by comparison the superior fits obtained 
in other cases. In each plot the dotted vertical 
lines demarcate the intervals used to construct the 
model. 



8.6. NSF funding for mathematics



The 1998 Odom Report [Odo98] recommended
steep increases in funding for mathematics research,
and from 2001 to 2004 annual NSF funding
for its Division of Mathematical Sciences rose
dramatically (Fig. [S8]). A change point fit to these
numbers over 1995-2004 identifies a change year 
of 2000.36. Using 5-year averages instead, with 
each interval identified by its last year following 
the pattern used for other statistics, a change 
point fit over 1996-2006 identifies the change year 
2001.33. Both values are toward the beginning of 
the collection of change years identified for net- 
work statistics, supporting a causal hypothesis, 
but significantly later than several specific statis- 
tics, suggesting that increased NSF funding may 
have contributed to, but was not the sole driver 
of, the second event. 
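
The 5-year averaging convention used here, in which each window is identified by its last year, can be sketched as follows. The funding values are made up purely for illustration and are not NSF figures; the function name `trailing_mean` is likewise hypothetical.

```r
# Sketch: 5-year trailing averages, each labeled by the last year of its window.
# The funding values below are made up for illustration; they are not NSF figures.
trailing_mean <- function(values, years, k = 5) {
  # sides = 1 averages the current value with the k - 1 values before it
  avg <- stats::filter(values, rep(1 / k, k), sides = 1)
  keep <- !is.na(avg)                       # the first k - 1 windows are incomplete
  data.frame(year = years[keep], mean = as.numeric(avg)[keep])
}

years <- 1991:2000
funding <- c(10, 10, 11, 11, 12, 12, 13, 15, 18, 22)  # hypothetical, arbitrary units
trailing_mean(funding, years)
```

The first row of the result is labeled 1995 and averages the values for 1991 through 1995, matching the labeling used for the network statistics.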







Figure S8: Funding by the National Science Foundation's 
Division of Mathematical Sciences, 1985-2007. A surge in
funding beginning in 2001 was concurrent with the second 
event, specifically the surge in authorship, and leveled off 
after 2005. 